Abstract

Point configurations have been widely used as model systems in condensed matter physics, 
materials science and biology. Statistical descriptors such as the n-body distribution function gn
are usually employed to characterize the point configurations, among which the most extensively
used is the pair distribution function g2. An intriguing inverse problem of practical importance 
that has been receiving considerable attention is the degree to which a point configuration can be 
reconstructed from the pair distribution function of a target configuration. Although it is known 
that the pair-distance information contained in g2 is in general insufficient to uniquely determine
a point configuration, this concept does not seem to be widely appreciated and general claims 
of uniqueness of the reconstructions using pair information have been made based on numerical 
studies. In this paper, we introduce the idea of the distance space, called the D space. The pair 
distances of a specific point configuration are then represented by a single point in the D space.
We derive the conditions on the pair distances that can be associated with a point configuration, 
which are equivalent to the realizability conditions of the pair distribution function g2. Moreover,
we derive the conditions on the pair distances that can be assembled into distinct configurations,
i.e., with structural degeneracy. These conditions define a bounded region in the D space. By
explicitly constructing a variety of degenerate point configurations using the D space, we show 
that pair information is indeed insufficient to uniquely determine the configuration in general. 
We also discuss several important problems in statistical physics based on the D space, including 
the reconstruction of atomic structures from experimentally obtained g2 and a recently proposed 
"decorrelation" principle. The degenerate configurations have relevance to open questions involving 
the famous traveling salesman problem. 

PACS numbers: 05.20.-y, 61.43.-j 
I. INTRODUCTION



A collection of a finite or infinite number of points in d-dimensional Euclidean space R^d
is called a point configuration. Point configurations are one of the most popular and widely
used models for many-particle systems in various branches of modern science, including
condensed matter physics and materials science [1, 2, 3, 4, 5], statistical mechanics [6, 7, 8],
discrete mathematics (packing problems) [9], astrophysics (distribution of galaxy clusters)
[10, 11], ecology (tree distributions in forests) [12] and biology (various cellular structures)
[13]. Point configurations can exhibit a variety of degrees of disorder, from the most random
Poisson distribution [2] to a perfectly ordered Bravais lattice [1]. The degrees of disorder
can be quantified by discriminating order metrics [14], which, in their simplest forms, are
scalars and normalized such that the most disordered system is associated with zero and the
most ordered ones with unity.

In most circumstances, it is impossible and even unnecessary to acquire detailed knowledge
of all positions of the points in the configuration. Instead, statistical descriptors such as
distribution functions are typically employed to characterize the point configurations. In
particular, the n-body distribution function gn(x1, x2, ..., xn) is related to the probability of
finding a generic configuration of n points at positions x1, x2, ..., xn. It is well known that a
set of n-body distribution functions g1, g2, ..., gn [2] is required to statistically characterize
an n-point configuration completely. As n → ∞ in the thermodynamic limit (e.g., the
volume V which the n points occupy also increases to infinity such that the number of
points per volume, i.e., the number density ρ = N/V, is a well-defined finite number), the set
contains an infinite number of correlation functions. For statistically homogeneous systems,
which are the focus of this paper, gn is translationally invariant and hence depends only on
the relative displacements of the positions with respect to some chosen origin, say x1, i.e.,
gn(x1, x2, ..., xn) = gn(x12, x13, ..., x1n) with xij = xj − xi. Thus, the one-body distribution
function g1 is just equal to the number density ρ. The important two-body quantity g2(x12)
is usually referred to as the pair distribution function. In the statistically isotropic case,
g2 is a radial function, i.e., g2(x12) = g2(|x12|), and it is also called the radial distribution
function. The radial distribution function, which is one of the most widely used structural
descriptors, essentially provides the distribution of the point-pair separation distances and
can be obtained experimentally via scattering of radiation. The three-body function g3
contains information about how the pair separations involved in g2 are linked into triangles.

It is worth noting that by decorating the points in the system (e.g., letting equal-sized
spheres be centered at each point), one can construct a two-phase random texture from
a given point configuration. In general, there is an infinite number of ways to decorate a
point configuration. In the characterization of random textures, the analogs of the n-body
distribution functions are the n-point correlation functions Sn(x1, x2, ..., xn) [2], which give
the probability of finding n points at positions x1, x2, ..., xn in the phase of interest. In
general, a complete statistical characterization of a continuum random texture requires
an infinite set of Sn. Although under certain conditions gn of a point configuration and
Sn of the associated decorated random texture might convey the same level of structural
information (in fact, the associated Sn can be expressed as a functional of gn given the details of
the decorating phase [2]), the former evidently reflects the essential geometrical features of
the point configuration more directly.

An intriguing inverse problem that has been receiving considerable attention is the
reconstruction (or construction) of realizations of a many-body system (essentially a point
configuration) that match the prescribed structural information of the system in the form
of g2 or S2, obtained from either experiments or theoretical considerations. Examples
include the reconstruction of random media [23] and colloidal suspensions [24], investigation
of the iso-g2 process [25] or g2-invariant processes and the realizability conditions of g2 [26],
as well as the more recent discovery of unusual disordered classical ground states [27].
X-ray scattering techniques have historically been an indispensable tool
in the study of the structures of crystalline matter, and they have been generalized
to probing disordered media [1]. In particular, the pair distribution function g2(r) is
obtainable from the Fourier transform of the structure factor S(k) [1], which is proportional
to the scattering intensity (with the atomic structure function removed) and can be directly
measured in experiments. With the obtained g2, one can then employ various reconstruction
techniques to generate realizations of the system of interest. Another related family
of inverse problems is the reconstruction of the pair interaction potential from a given radial
distribution function g2(r) between particles, i.e., the inverse Monte Carlo problems [28].

It is known that though the information contained in g2 can be sufficient to completely
characterize ordered point configurations in very special circumstances [29], it generally
lacks the structural information needed to uniquely determine a disordered point
configuration [30, 31, 32]. However, it seems that this aspect has not been widely appreciated, and
general claims of uniqueness of the reconstructions using g2 or S2 have been made based on
numerical studies [18, 23]. One aim of this paper is to show via a variety of examples the
existence of distinct point configurations with identical pair-distance distributions (e.g., g2),
which implies the non-uniqueness of the reconstructions involving g2 of these point
configurations. In addition, a general mathematical formalism to characterize the structural ambiguity
of pair information is devised.




FIG. 1: (color online). An example of two-dimensional four-point configurations possessing
two-fold degeneracy: (a) A "kite", (b) A "trapezoid". The specific distance sets are
a = (2x^2 − 3x + 5/4)^{1/2}, b = (2x^2 − x + 1/4)^{1/2}, c = 2x − 1, d = 1, for 1/2 < x < 1.
For x > 1, the outer boundary of the "kite" is no longer a quadrilateral but reduces to an
isosceles triangle.

Figure 1 shows two distinct configurations of four points in two dimensions with identical
pair distances. In particular, one configuration (with the pair distances shown) resembles a
"kite" and the other resembles a "trapezoid". In order to provide an in-depth presentation
of the ambiguity of pair-distance distributions, it is necessary to examine the problem
mathematically first and then discuss the physical implications. Some definitions are in order here.
Two d-dimensional statistically homogeneous and isotropic n-point configurations Γ^1_{d,n} and
Γ^2_{d,n} are identical if and only if they possess identical sets of k-body distribution functions
gk for k = 1, 2, ..., n. The configurations Γ^1_{d,n} and Γ^2_{d,n} are gk-distinct if and only if they
possess distinct n-body distribution functions for all n ≥ k. A d-dimensional n-point
configuration Γ^1_{d,n} is k-fold degenerate if and only if there exist (k − 1) additional d-dimensional
n-point configurations Γ^i_{d,n} (i = 2, ..., k) that are mutually g3-distinct and also g3-distinct
from Γ^1_{d,n}, all of which possess the same two-body distribution function g2. This definition
of structural degeneracy rules out the possibility that two degenerate point configurations
are trivially connected by translation, rotation, mirror reflection or any of their
combinations. Moreover, we consider that two point configurations are equivalent (i.e., do not form
a degenerate pair) if they are related by a trivial isotropic rescaling, which does not change
the internal structure of the configuration. Thus, we see that the "kite" and the "trapezoid"
are associated with the same set of pair distances (i.e., they are two-fold degenerate), but
the triangle information of the two is distinguishable [32]. It is worth noting the
historically prominent Kirkwood superposition approximation of g3, which replaces the three-body
distribution function with a product of three pair distribution functions. Because the
separate members of our pair-distance degeneracy examples present distinct triangle (i.e.,
three-body) distributions, the conclusion must be that no functional of g2 (Kirkwood or
otherwise) can uniquely specify g3.
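
The "kite-trapezoid" degeneracy can be checked numerically. The following is a minimal sketch, assuming one particular choice of coordinates (ours, not taken from the figure) that realizes the distances a, b, c, d of Fig. 1 for a given x; the function names are hypothetical. It compares the sorted pair distances and the sets of triangles of the two configurations.

```python
import itertools
import math


def pair_distances(points):
    """Sorted list of all pairwise Euclidean distances."""
    return sorted(math.dist(p, q) for p, q in itertools.combinations(points, 2))


def triangles(points):
    """Sorted list of triangles, each described by its sorted edge lengths."""
    result = []
    for trio in itertools.combinations(points, 3):
        edges = sorted(math.dist(p, q) for p, q in itertools.combinations(trio, 2))
        result.append(tuple(round(e, 9) for e in edges))
    return sorted(result)


def kite(x):
    # Assumed coordinates: symmetry axis of length d = 1 on the horizontal axis,
    # cross diagonal of length c = 2x - 1 located at distance 1 - x from the origin.
    h = x - 0.5
    return [(0.0, 0.0), (1.0, 0.0), (1.0 - x, h), (1.0 - x, -h)]


def trapezoid(x):
    # Assumed coordinates: parallel sides of lengths d = 1 and c = 2x - 1,
    # separated by a height x - 1/2.
    h = x - 0.5
    return [(0.0, 0.0), (1.0, 0.0), (x, h), (1.0 - x, h)]


x = 0.8  # any value with 1/2 < x < 1
same_pairs = all(
    math.isclose(u, v, abs_tol=1e-12)
    for u, v in zip(pair_distances(kite(x)), pair_distances(trapezoid(x)))
)
same_triangles = triangles(kite(x)) == triangles(trapezoid(x))
print("identical pair distances:", same_pairs)
print("identical triangles:", same_triangles)
```

With these coordinates the two configurations differ only in the position of a single point, which is reflected across the horizontal line through the other off-axis point; the pair distances are unchanged while the triangles are not.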

It is clear that given g2 associated with the degenerate point configurations, it is impos- 
sible even in principle to obtain a unique reconstruction, and each degenerate configuration 
should be recovered with equal probability. Therefore, an outstanding problem is to deter- 
mine under what conditions the pair distance information contained in g2 could uniquely 
determine a point configuration, i.e., there is no associated structural degeneracy. A question
of more practical importance is how the point configurations would change when
the measurement of g2 is subject to slight imprecision, a common situation in experiments 
and numerical simulations. 




FIG. 2: (color online). The three distinguishable circuits for the "trapezoid" [upper panel, (a)-(c)] and
two distinguishable circuits for the "kite" [lower panel, (d) and (e)]. The circuits are shown in thick red lines.
The circuit shown in (a) is the shortest route among all possibilities.
In addition to their physical relevance, degenerate point configurations are also of
mathematical interest. For example, open questions connected to the famous traveling salesman
problem [33] can be raised: Given the degenerate configurations associated with the same
set of pair distances, what are the optimal solutions of the traveling salesman problem for
each configuration and are they unique? Are there special degenerate configurations whose
solutions are identical? For the simplest "kite-trapezoid" example shown in Fig. 1, for the
parameter values x > 1/2 the "trapezoid" has three distinguishable circuits and the "kite"
has only two (see Fig. 2). The shortest route among all, i.e., the shortest closed circuit visiting
each vertex once and only once, is presented by the "trapezoid". For x = 1/2, both the "kite" and
the "trapezoid" collapse onto a line segment, and in that limiting case all circuits have the
same length. For more general and complicated degenerate configurations, such questions
are notoriously difficult to solve; the problem belongs to the NP-complete class.
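
The circuit counts quoted above can be reproduced by brute-force enumeration. The sketch below is illustrative only: the coordinates of the "kite" and the "trapezoid" are an assumed realization of the distance sets of Fig. 1, and `distinct_circuit_lengths` is a hypothetical helper that lists the distinct lengths of all closed circuits visiting each point exactly once.

```python
import itertools
import math


def distinct_circuit_lengths(points, tol=1e-9):
    """Distinct lengths of all closed tours through the given points."""
    first, rest = points[0], points[1:]
    lengths = []
    for perm in itertools.permutations(rest):
        tour = [first, *perm, first]
        total = sum(math.dist(p, q) for p, q in zip(tour, tour[1:]))
        # Tours that differ only by direction have the same length and are merged.
        if not any(abs(total - seen) < tol for seen in lengths):
            lengths.append(total)
    return sorted(lengths)


def kite(x):
    # Assumed coordinates realizing a, b, c, d of Fig. 1.
    h = x - 0.5
    return [(0.0, 0.0), (1.0, 0.0), (1.0 - x, h), (1.0 - x, -h)]


def trapezoid(x):
    # Assumed coordinates realizing the same six distances.
    h = x - 0.5
    return [(0.0, 0.0), (1.0, 0.0), (x, h), (1.0 - x, h)]


x = 0.8  # any value with 1/2 < x < 1
trapezoid_circuits = distinct_circuit_lengths(trapezoid(x))
kite_circuits = distinct_circuit_lengths(kite(x))
print("trapezoid circuit lengths:", trapezoid_circuits)
print("kite circuit lengths:", kite_circuits)
```

Exhaustive enumeration is only practical for a handful of points, since the number of tours grows factorially with the number of points.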



FIG. 3: (color online). (a) A three-point configuration (i.e., a triangle) in R^2. (b) The region of
feasible distances in the D space (bounded by the blue planes). The three pair distances of the
triangle shown in (a) are represented as a point (red spot) in the D space.

In this paper, we introduce the idea of the distance space (i.e., the D space), in which
each dimension is associated with the separation distance between a given point-pair. The
pair-distance distribution of a particular point configuration is then represented by a single
point in the D space. It is clear that not all the points in the D space correspond to realizable
configurations, i.e., the separation distances have to satisfy certain conditions such that they
can be assembled into a point configuration. These conditions together define a (partially)
bounded region in the D space. For example, for three-point configurations in R^2 (i.e.,
triangles), the region of the feasible distances is an open "pyramid" in the three-dimensional
D space, as shown in Fig. 3. When degeneracy exists, the region of the feasible distances is
generally a complicated closed intersection of several such simple curved "pyramids" in high
dimensions. The determination of the region of feasible distances is equivalent to obtaining
the conditions of a realizable g2, i.e., a pair distribution function that can be associated
with a point configuration. Using the D space, we will answer various aspects of the
aforementioned questions concerning the degenerate point configurations and the non-uniqueness
issue of the reconstruction. We will show that the utility of the D space also improves our
understanding of various important problems in statistical physics such as the recently
proposed decorrelation principle in high-dimensional Euclidean space [36]. In a sequel to this
paper, we will extend the present analysis to understand degeneracy issues pertaining to
heterogeneous materials, which is a larger classification than point configurations [37].

The rest of the paper is organized as follows: In Sec. II, we discuss the D space in 
detail and derive the conditions for feasible distances and for the occurrence of degeneracy, 
through which we show that in general degeneracy is rare. In Sec. III, we provide a variety
of examples of degenerate point configurations and illustrate how the conditions derived 
in Sec. II could be employed to construct point configurations with specific degeneracy. 
In Sec. IV, we discuss several problems in statistical physics such as the reconstruction of 
atomic structures from experimentally obtained g2 and the decorrelation principle, based on 
the idea of the D space. Finally, we make concluding remarks.



II. THE DISTANCE SPACE D 



In this Section, we will discuss the D space in detail. In particular, we will derive the
conditions under which the pair distances could be assembled into a point configuration, i.e.,
the feasibility conditions, as well as the conditions under which the pair distances correspond
to degenerate point configurations. We will first study a four-point configuration in R^2 to
illustrate the idea and then consider the general n-point configurations in R^d.
A. A Simple Example: Four-Point Configuration in R^2



Consider a four-point configuration Γ_{2,4} in R^2 (see Fig. 4) and the associated
six-dimensional D space. We would like to know the answers to the following two questions:

(i) What are the conditions the six pair separation distances must satisfy so that they
correspond to a four-point configuration in R^2?

(ii) What are the conditions the pair distances must satisfy so that they correspond to k-fold
degenerate four-point configurations in R^2?




FIG. 4: A four-point configuration in R^2.

To answer these questions, we consider a particular construction as follows: Suppose
the six pair distances are elements of the set Ω = {d1, d2, ..., d6}, which can be further
partitioned as Ω = {P1, P2, P3, P4}, where P1 = {∅} (∅ is the null set), P2 = {d1}, P3 =
{d2, d3} and P4 = {d4, d5, d6}. We will see that such a partition enables us to associate
the pair distances with the corresponding points in a convenient way. Recall that from our
definition (Sec. I), point configurations are considered identical if they are connected by
translation, rotation, mirror reflection and any of their combinations. Thus, we can put
point P1 at the origin of a Cartesian coordinate system and put point P2 on one of the two
orthogonal axes of the coordinate system, separated from the origin (i.e., P1) by a distance
d1. Note that different choices of the position of P1 and the orientation of the line segment P1P2
lead to point configurations that are identical up to translations and rotations. For point P3,
we can either let P1P3 = d2, P2P3 = d3 or P1P3 = d3, P2P3 = d2. The two choices correspond
to two configurations connected by a mirror reflection, which are considered identical, so
either choice is acceptable. Without loss of generality, we choose P1P3 = d2, P2P3 = d3.
Finally, we choose P1P4 = d4, P2P4 = d5, P3P4 = d6 for point P4.
We see from the above construction that the positions of points P3 and P4 are determined
with respect to the line segment defined by points P1 and P2 as a "reference" structure.
Note that the line segment is a one-dimensional simplex. In R^2, the position of a point is
completely determined by specifying two distances from the point of interest to the vertices
of a reference line segment, given that the distances involved satisfy the triangle inequality,
i.e., the triangle formed by the point of interest and the two vertices of the reference line
segment possesses non-negative area. The area A of a triangle with edges a, b, c is related to
the Cayley-Menger determinant [38], i.e.,

\[
A^2 = -\frac{1}{16}
\begin{vmatrix}
0 & 1 & 1 & 1 \\
1 & 0 & a^2 & b^2 \\
1 & a^2 & 0 & c^2 \\
1 & b^2 & c^2 & 0
\end{vmatrix} \ge 0. \tag{1}
\]

Thus, for point P3, we obtain

\[
d_1^4 + d_2^4 + d_3^4 - 2 d_1^2 d_2^2 - 2 d_1^2 d_3^2 - 2 d_2^2 d_3^2 \le 0, \tag{2}
\]

and for point P4, we obtain

\[
d_1^4 + d_4^4 + d_5^4 - 2 d_1^2 d_4^2 - 2 d_1^2 d_5^2 - 2 d_4^2 d_5^2 \le 0. \tag{3}
\]



Inequalities (2) and (3) define a partially bounded region in the six-dimensional D space,
the lower-dimensional analog of which is the open pyramid shown in Fig. 3. The distance
d6 between points P3 and P4 is also completely determined by d1, d2, ..., d5 via [32]

\[
\begin{vmatrix}
-2 d_1^2 & (d_3^2 - d_1^2 - d_2^2) & (d_5^2 - d_1^2 - d_4^2) \\
(d_3^2 - d_1^2 - d_2^2) & -2 d_2^2 & (d_6^2 - d_2^2 - d_4^2) \\
(d_5^2 - d_1^2 - d_4^2) & (d_6^2 - d_2^2 - d_4^2) & -2 d_4^2
\end{vmatrix} = 0, \tag{4}
\]

which results from the requirement that all the 3 × 3 minors of the Gram matrix
involving the distances possess zero determinant. We will discuss the Gram matrix in detail
in Sec. II B. Equation (4) defines a curved hypersurface in the D space, whose intersection
with the region defined by Eqs. (2) and (3) contains the feasible distances Ω that can be
assembled into a four-point configuration in R^2. We call Eqs. (2), (3) and (4) the feasibility
conditions. Note that for the four-point configuration in R^2, only five pair distances can
be chosen almost independently subject to the mild triangle inequality constraint. Thus
we define the free dimension of the D space to be the number of pair distances that are
only constrained by inequalities, which totals five here. The free dimension is also the
dimension of the region for the feasible distances, which is also referred to as the feasible
region.
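
As an illustration of the feasibility conditions, the following sketch tests whether six distances, given in the specific order (d1, ..., d6) of the construction above, satisfy Eqs. (2), (3) and (4). The function names and the numerical tolerances are our own assumptions; only one ordering of the distances is tested, so a full check of a distance list would repeat this over its permutations.

```python
import numpy as np


def triangle_ok(a, b, c, tol=1e-12):
    """Inequality of the form of Eqs. (2) and (3): non-negative triangle area."""
    expr = a**4 + b**4 + c**4 - 2 * (a * a * b * b + a * a * c * c + b * b * c * c)
    return expr <= tol


def gram_determinant(d1, d2, d3, d4, d5, d6):
    """Left-hand side of Eq. (4) for the labeling P1P2 = d1, ..., P3P4 = d6."""
    m = np.array([
        [-2 * d1**2, d3**2 - d1**2 - d2**2, d5**2 - d1**2 - d4**2],
        [d3**2 - d1**2 - d2**2, -2 * d2**2, d6**2 - d2**2 - d4**2],
        [d5**2 - d1**2 - d4**2, d6**2 - d2**2 - d4**2, -2 * d4**2],
    ])
    return float(np.linalg.det(m))


def feasible_four_points(distances, tol=1e-9):
    """True if this ordering of six distances is a four-point configuration in R^2."""
    d1, d2, d3, d4, d5, d6 = distances
    return bool(
        triangle_ok(d1, d2, d3)
        and triangle_ok(d1, d4, d5)
        and abs(gram_determinant(d1, d2, d3, d4, d5, d6)) <= tol
    )


# Unit square with P1 = (0,0), P2 = (1,0), P3 = (1,1), P4 = (0,1).
s = 2 ** 0.5
print(feasible_four_points((1, s, 1, 1, s, 1)))    # square: conditions hold
print(feasible_four_points((1, s, 1, 1, s, 1.2)))  # d6 altered: Eq. (4) violated
```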

Now we can answer question (i) given at the beginning of this Section easily. Suppose a
list of distances is given; when any one permutation of these distances satisfies the feasibility
conditions [Eqs. (2), (3) and (4)], the pair distances correspond to a four-point configuration
in R^2. However, such a simple answer does not exist for (ii). For the pair distances to
correspond to k-fold degenerate point configurations, a necessary condition is that the dimension
of the intersection of the feasible regions for the k permutations of the pair distances is
non-zero. In other words, each permutation of the pair distances is associated with a set of
feasibility conditions, and a feasible region can be constructed. To obtain a k-fold
degeneracy, all sets of the feasibility conditions need to be satisfied simultaneously, which is only
possible when the intersection of the feasible regions is at least a single curve in the D space.
For the two-dimensional four-point configuration of interest the free dimension is five, which
leads to an upper bound on the order of the degeneracy, i.e., kmax = 5. This condition is only
a necessary one because there are certain permutations that lead to identical configurations,
such as those that correspond to the permutations among the point indices which do not
change the structure of the configuration, since the points are indistinguishable. For
example, Ω1 = {d1, d2, d3, d4, d5, d6} and Ω2 = {d1, d4, d5, d2, d3, d6} correspond to an exchange of
P3 and P4, which possess identical feasible regions and thus do not contribute to the
degeneracy. No further conclusions can be made without knowing the details of how the
distances are permuted. In Sec. III we will construct concrete examples of degenerate Γ_{2,4},
where the details of permutations are considered.

B. General Formulation: Feasibility Conditions 

The generalization of the above formulation is straightforward. Note that in the following
discussion in this Section, we assume n > (d + 1); the case n = d + 1 (i.e., the
simplex configurations) is discussed in detail in Sec. III A and the case n < (d + 1)
is trivial. Consider an n-point configuration Γ_{d,n} in R^d, which possesses m = n(n − 1)/2
pair distances Ω = {d1, d2, ..., dm}. The distances can be further partitioned, i.e., Ω =
{P1, P2, ..., Pn}, where P1 = {∅}, P2 = {d1}, ..., Pi = {d_{(i^2−3i+4)/2}, ..., d_{(i^2−i)/2}}, ...,
Pn = {d_{(n^2−3n+4)/2}, ..., d_{(n^2−n)/2}}. Following the same construction procedure prescribed in
Sec. II A, the distances associated with the first d points, i.e., P1, P2, ..., Pd, are assembled
into a (d − 1)-dimensional simplex as the "reference" structure. Each point Pi (i > d) is
associated with (i − 1) distances and the position of point Pi is completely determined by
specifying the d distances from Pi to the vertices of the reference structure, given that the
d-dimensional simplex formed by Pi and the vertices of the reference structure possesses a
nonnegative volume. In particular, denoting the (d + 1) vertices of the d-dimensional simplex
by r_i (i = 1, ..., d + 1), we can define a (d + 1) × (d + 1) distance matrix M, i.e.,

\[
M_{ij} = M_{ji} = \| \mathbf{r}_i - \mathbf{r}_j \|^2, \tag{5}
\]

where ||·|| denotes the L2-norm of a d-dimensional vector and M_ij (M_ji) is the squared
distance between vertices i and j. The volume A of the simplex is then given by the
Cayley-Menger determinant, i.e.,

\[
A^2 = \frac{(-1)^{d+1}}{2^d (d!)^2} \, |\hat{M}| \ge 0, \tag{6}
\]

where \hat{M} is a (d + 2) × (d + 2) matrix obtained from M by bordering M with a top row
(0, 1, ..., 1) and a left column (0, 1, ..., 1)^T. For example, Eq. (6) reduces to Eq. (1) in R^2,
and in R^3 we obtain

\[
A^2 = \frac{1}{288}
\begin{vmatrix}
0 & 1 & 1 & 1 & 1 \\
1 & 0 & M_{12} & M_{13} & M_{14} \\
1 & M_{21} & 0 & M_{23} & M_{24} \\
1 & M_{31} & M_{32} & 0 & M_{34} \\
1 & M_{41} & M_{42} & M_{43} & 0
\end{vmatrix}. \tag{7}
\]

The requirement that the constructed d-dimensional simplex possesses non-negative volume
leads to higher dimensional analogs of the well known triangle inequalities in two dimensions,
which we will refer to as simplex inequalities. In general, each set of the simplex inequalities
associated with a point Pi (i > d) defines a partially bounded region in the D space, the
intersection of which is a high-dimensional analog of the open pyramid shown in Fig. 3(b).
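
The Cayley-Menger formula of Eq. (6) is straightforward to evaluate numerically. The sketch below is a minimal illustration, assuming the squared-distance matrix M of Eq. (5) is supplied as a nested list or array; the function name `simplex_volume_squared` is hypothetical. A negative return value signals that the simplex inequalities are violated.

```python
import math

import numpy as np


def simplex_volume_squared(sq_dist):
    """Squared volume of a d-simplex from its (d+1) x (d+1) squared-distance matrix."""
    m = np.asarray(sq_dist, dtype=float)
    k = m.shape[0]  # number of vertices, k = d + 1
    d = k - 1
    # Border M with a top row (0, 1, ..., 1) and a left column (0, 1, ..., 1)^T.
    bordered = np.ones((k + 1, k + 1))
    bordered[0, 0] = 0.0
    bordered[1:, 1:] = m
    coeff = (-1) ** (d + 1) / (2**d * math.factorial(d) ** 2)
    return coeff * float(np.linalg.det(bordered))


# A 3-4-5 right triangle (d = 2): area 6, so the squared area is 36.
print(simplex_volume_squared([[0, 9, 16], [9, 0, 25], [16, 25, 0]]))

# Distances 1, 2, 3 violate the strict triangle inequality: zero area.
print(simplex_volume_squared([[0, 1, 9], [1, 0, 4], [9, 4, 0]]))
```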
It is clear from the above construction that a point configuration Γ_{d,n} can be completely
determined by only specifying f = [d(d − 1)/2 + (n − d)d] distances (i.e., "free" distances),
satisfying the simplex inequalities. Thus, the free dimension of the D space is f and the
remaining (m − f) pair distances (i.e., "constrained" distances) cannot be chosen freely
but instead are completely determined by the f "free" distances. To obtain the relations
between the "constrained" and "free" distances, we will employ the following theorem [32]:

Theorem 1: For a set of n vectors v1, v2, ..., vn in R^d (n > d), let the Gram matrix be
defined by G_ij = ⟨v_i, v_j⟩, where ⟨·,·⟩ denotes the inner product. Then all (d + 1) × (d + 1)
minors of G must have zero determinant.



The proof of the theorem is given in Ref. [32]. It is essentially another way of stating the fact
that there are at most d linearly independent vectors among v1, v2, ..., vn in a d-dimensional
Euclidean space. Without loss of generality, we can choose the origin at v1 and obtain

\[
G_{ij} = \langle v_i, v_j \rangle = \langle v_i - v_1, v_j - v_1 \rangle. \tag{8}
\]

Considering the identity [32]

\[
\langle v_i - v_1, v_j - v_1 \rangle = \frac{1}{2} \left[ \langle v_i - v_1, v_i - v_1 \rangle + \langle v_j - v_1, v_j - v_1 \rangle - \langle v_i - v_j, v_i - v_j \rangle \right], \tag{9}
\]

we obtain that

\[
G_{ij} = \frac{1}{2} \left( d_{i1}^2 + d_{j1}^2 - d_{ij}^2 \right), \tag{10}
\]

where d_ij is the distance between the two points i and j. Thus we see that the requirement that
all (d + 1) × (d + 1) minors of G [denoted by M_{(d+1)}(G)] have zero determinant, i.e.,

\[
| M_{(d+1)}(G) | = 0, \tag{11}
\]

leads to fourth order algebraic equations involving the (m − f) "constrained" distances. It
is clear that each "constrained" distance can be explicitly expressed as a function of the
"free" distances alone. For the four-point configuration in R^2, Eq. (11) gives Eq. (4). Note that
these equalities define curved hypersurfaces in the D space. The intersection of these curved
hypersurfaces as well as the partially bounded regions defined by Eq. (6) gives the feasible region
of the D space, i.e., when any permutation of the m = n(n − 1)/2 pair distances lies within
the feasible region, these distances can be assembled into an n-point configuration in R^d.
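
For a labeled set of distances, the Gram matrix of Eq. (10) gives a compact numerical test of feasibility. The sketch below deliberately swaps in a standard linear-algebra criterion rather than evaluating the minors of Eq. (11) one by one: the distances are taken to be realizable in R^d when G has no negative eigenvalues and at most d positive ones. The function names and the tolerance are our own assumptions, and the test applies to one fixed labeling of the distances, not to all permutations.

```python
import numpy as np


def distance_matrix(points):
    """Full n x n matrix of pairwise Euclidean distances."""
    p = np.asarray(points, dtype=float)
    return np.linalg.norm(p[:, None, :] - p[None, :, :], axis=-1)


def gram_from_distances(dist):
    """Gram matrix of Eq. (10), with the origin placed at the first point."""
    d2 = np.asarray(dist, dtype=float) ** 2
    # G_ij = (d_i1^2 + d_j1^2 - d_ij^2) / 2 for i, j = 2, ..., n
    return 0.5 * (d2[1:, :1] + d2[:1, 1:] - d2[1:, 1:])


def realizable_in(dist, d, tol=1e-9):
    """True if the labeled distances can be assembled into points in R^d."""
    eigenvalues = np.linalg.eigvalsh(gram_from_distances(dist))
    no_negative = eigenvalues.min() >= -tol
    rank = int(np.sum(eigenvalues > tol))
    return bool(no_negative and rank <= d)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(realizable_in(distance_matrix(square), 2))  # four points in the plane
print(realizable_in(distance_matrix(square), 1))  # but not on a line
```

The rank condition plays the role of the vanishing (d + 1) × (d + 1) minors, while the absence of negative eigenvalues plays the role of the simplex inequalities.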
C. General Formulation: Necessary Conditions of Degeneracy



For the distances Ω to correspond to k-fold degenerate point configurations Γ^1_{d,n}, Γ^2_{d,n}, ...,
Γ^k_{d,n}, the feasibility conditions for k distinct permutations of Ω should be satisfied
simultaneously. The feasibility conditions associated with any particular permutation of Ω
include a set of equalities, which would reduce the dimension of the feasible region in the D
space. Suppose that for each distinct permutation only one additional equality constraint
is introduced. Then we can obtain an upper bound on the order of the degeneracy, i.e.,
kmax = f, which corresponds to a feasible region that has been reduced to a single curve
(with one free dimension). That is, only one distance can be chosen arbitrarily. However, 
different choices of the single free distance correspond to trivial isotropic rescaling of the 
entire configuration, which leads to no degeneracy based on our definition. Note that if the 
permutation does not introduce new feasibility conditions, it corresponds to a permutation 
of the point indices, which does not affect the structure of the configuration. 

The properties of the feasible regions have important implications. As we have seen, the 
structural degeneracy would reduce the dimension of the feasible regions, the volume of which 
is proportional to the number of feasible distance sets. For a particular n-point configuration, 
we could in principle identify all feasible distance sets by exploring the whole feasible region 
in the D space point by point. However, the distance sets associated with degeneracies can 
only lie on a hypersurface with lower dimensions than the feasible region. The volume ratio 
of the hypersurface to the feasible region, which is also the number ratio of the distance 
sets associated with degeneracies to those without degeneracies, is vanishingly small. In 
other words, although degeneracies exist they are extremely rare. This might explain why 
perfect reconstructions (identical match of the pair distances and the configurations up to
translations, rotations and mirror reflections) can be obtained numerically [18, 23]. However,
the general conclusion that pair statistics alone uniquely determine the configurations
cannot be drawn from those numerical results alone, as we will show in the next section
via a variety of examples of degeneracy.
III. EXAMPLES OF DEGENERATE POINT CONFIGURATIONS



In this Section, we construct a variety of degenerate point configurations using the general
scheme developed in Sec. II. In particular, we study the degeneracies of simplices in R^d,
four-point configurations in R^2 and specific n-point configurations in R^d possessing two-fold
degeneracy.

A. Degenerate d-Dimensional Simplices 

A simplex in R^d is the convex hull of a set of (d + 1) points Γ_{d,d+1} that do not all lie
on the same (d − 1)-dimensional hyperplane. A simplex in R^2 is a triangle and in R^3 is a
tetrahedron. Simplices in R^d (d ≥ 4) can be considered to be d-dimensional generalizations
of the three-dimensional tetrahedron. The simplex is so-named because it represents the
simplest possible polyhedron in the given dimension. The volume of a d-dimensional simplex
is given by the Cayley-Menger determinant, Eq. (6).

A unique feature of simplex configurations Γ_{d,d+1} is that their feasibility conditions only
include the simplex inequalities. These inequalities define a partially bounded region
possessing the same dimensions as the D space. In other words, the free dimension of the
feasible region is not reduced due to degeneracy. Thus, one should expect that it is much
easier to obtain highly degenerate simplices than other point configurations.

Suppose we have a distance set Ω = {d1, d2, ..., dm} (m = d(d + 1)/2). It is clear that
if we choose di = d + δi, where δi (i = 1, 2, ..., m) are mutually distinct small numbers,
they will satisfy all the simplex inequalities and correspond to a point in the vicinity of
the centroid of the feasible region. The maximum magnitude of the δ's depends on the
boundaries of the feasible region, which need not concern us for the moment, as long
as the δ's are sufficiently small and mutually distinct.

In R^2 (d = 2), the three distances can be assembled into a unique triangle, i.e., we have
kmax^(2) = 1. This can also be seen from the following argument: since configurations connected
by translations and rotations are considered identical, we could pick any one of the three
distances along one of the coordinate axes starting from the origin and require the same for
the corresponding distance of all possible degenerate configurations. In this way, we rule
out translations and rotations in a plane. There are only two distances left, which can be
assembled into a triangle in two ways. However, the two resulting triangles are mirror images
of each other. Thus, we have kmax^(2) = 2!/2 = 1, where "!" indicates factorial.

In $\mathbb{R}^3$ (d = 3), we similarly choose one of the six distances as the "reference" distance, 
and the remaining five are assigned to different edges of a tetrahedron, which results in 5! 
tetrahedra. However, among these tetrahedra there are pairs that are connected by mirror 
reflection, which have to be excluded. Two mirror reflection planes can be identified: one 
perpendicular to the reference distance and the other containing the reference distance. This 
further reduces the number of distinct tetrahedra by a factor of 1/4. Thus, we obtain 
$k_{\max}^{(3)} = 5!/(2 \times 2) = 30$. 

Generally, in d dimensions, when one of the m = d(d + 1)/2 distances is chosen as the 
reference distance, there are (m - 1)! ways to assemble the remaining (m - 1) distances into 
a simplex in $\mathbb{R}^d$. However, (d - 1) hyperplanes (among which one contains the reference 
distance and the others are perpendicular to it) can be identified that are mirror reflection 
hyperplanes of the simplex. Each mirror reflection reduces the number of distinct simplices 
by a factor of 1/2. Thus, we have 

$$k_{\max}^{(d)} = \frac{(m-1)!}{2^{d-1}} = \frac{[d(d+1)/2 - 1]!}{2^{d-1}}.$$

We can see that for simplex configurations in $\mathbb{R}^d$ with d ≥ 3, $k_{\max}^{(d)}$ is significantly larger 
than the dimension of the feasible region f = d(d + 1)/2, which indeed implies a high level 
of degeneracy associated with these configurations. 
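
To make the growth of this bound concrete, the short sketch below evaluates the counting argument above, i.e., (m - 1)! assignments divided by $2^{d-1}$ for the mirror reflections, and compares it with the free dimension f = d(d + 1)/2. The function name `k_max` is a hypothetical label chosen for this illustration.

```python
from math import factorial

def k_max(d):
    """Upper bound on the degeneracy of a d-dimensional simplex:
    (m - 1)! assignments of the non-reference distances, divided by
    2**(d - 1) for the mirror reflections, with m = d(d + 1)/2."""
    m = d * (d + 1) // 2
    return factorial(m - 1) // 2 ** (d - 1)

# Tabulate the bound next to the free dimension f = d(d + 1)/2.
table = {d: (d * (d + 1) // 2, k_max(d)) for d in range(2, 6)}
for d, (f, k) in table.items():
    print(f"d = {d}: f = {f}, k_max = {k}")
```

For d = 2 and d = 3 this reproduces the values 1 and 30 obtained in the text.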



B. Two-Dimensional Four-Point Configurations 

We show here how the conditions determining the feasible region in D space can be 
employed to construct four-point configurations $\Gamma_{2,4}$ in $\mathbb{R}^2$ with k-fold degeneracy. As pointed 
out in Sec. II, the feasibility conditions are only necessary and the details of how the distances 
are permuted must be considered. 

The relations of the six distances of $\Gamma_{2,4}$ are given by Eq. (4) for the particular order 
$\Omega = \{d_1, d_2, d_3, d_4, d_5, d_6\}$. For a permutation $\Omega^*$, the variable $d_i$ in Eq. (4) should be 
replaced by the ith element of $\Omega^*$, which generally leads to a different equation for the 
six distances. As mentioned in the previous subsection, we could choose $d_1$ as the reference distance 
to rule out translation and rotation and only consider the permutations of the remaining 
five distances, which gives 5! = 120 distinct equations. Without loss of generality, we could 
also choose $d_1 = 1$, which corresponds to a trivial isotropic rescaling of the entire point 
configuration. 

In principle, a k-fold degeneracy (k ≤ $k_{\max}$ = 5) could be constructed by requiring that 
the k equations for the five distances corresponding to k distinct permutations hold 
simultaneously. However, we find that high-level degeneracies (those with k close to $k_{\max}$) are 
difficult to realize. In particular, when k is large the equations for the distances possess 
roots that are algebraically multiple, i.e., $\Omega$ contains two or more equal-valued distances, 
which leads to configurations connected by rotations and mirror reflections. Thus the 
number of distinct configurations associated with the distances is smaller than k. For n-point 
configurations, the largest k that we have realized is k = n - 1. Due to space limitations, we 
could not exhaust all degeneracies for each k (i.e., about $C_{120}^{k}$ cases) in this paper and only 
provide a few specific examples. 



FIG. 5: (color online). An example of two-fold degenerate four-point configurations in $\mathbb{R}^2$. The 
distances are given by $d_1$ = 1, $d_2$ = 1.58114..., $d_3$ = 0.70710..., $d_4$ = 0.87228..., $d_5$ = 1.32698..., 
$d_6$ = 1.54551.... 

For k = 2, requiring the distance permutations $\Omega_1 = \{d_1, d_2, d_3, d_4, d_5, d_6\}$ and 
$\Omega_2 = \{d_1, d_2, d_3, d_6, d_4, d_5\}$ to hold simultaneously yields 

$$D(\Omega_1) = D(d_1, d_2, d_3, d_4, d_5, d_6) = 0, \qquad D(\Omega_2) = D(d_1, d_2, d_3, d_6, d_4, d_5) = 0, \tag{13}$$

where $D(x_1, x_2, x_3, x_4, x_5, x_6)$ is the multinomial given by 

[Eq. (14): explicit multinomial expression for $D(x_1, \ldots, x_6)$ in terms of the six distances; the expression is not recoverable from the source text.] 



Equation (13) reduces the free dimensions of the D space from five to four. Without loss of 
generality, we choose $d_1$ = 1, $d_2$ = 1.58114..., $d_3$ = 0.70710..., $d_4$ = 0.87228..., and solve 
Eq. (13) to obtain $d_5$ = 1.32698..., $d_6$ = 1.54551.... The two-fold degenerate configurations 
are shown in Fig. 5. It should be noted in passing that the "kite-trapezoid" example shown 
earlier in Fig. 1 is a special case of this four-point two-fold degeneracy, for which the shapes 
each have a reflection symmetry. If we require two other permutations $\Omega_3$ and $\Omega_4$, 
obtained from $\Omega_1$ and $\Omega_2$ by relabeling the points, to hold simultaneously, the same degeneracy is obtained, 
because the apparently different groups of distance permutations ($\Omega_1$, $\Omega_2$) and ($\Omega_3$, $\Omega_4$) 
correspond to the permutation of indistinguishable points. 
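
The two-fold example can be checked numerically by embedding the points in the plane. The sketch below is only an illustration under stated assumptions: the ordering (AB, AC, BC, AD, BD, CD) of the six arguments is a labeling we infer from the quoted decimals, and the helper names `place_third` and `closing_error` are hypothetical. It places A and B on the x axis, locates C and D from their distances to A and B, and reports how far the resulting C-D distance is from the prescribed one.

```python
from math import sqrt, dist

def place_third(ab, ac, bc):
    """Coordinates (with y >= 0) of a point at distance ac from A = (0, 0)
    and bc from B = (ab, 0)."""
    x = (ab ** 2 + ac ** 2 - bc ** 2) / (2 * ab)
    y = sqrt(max(ac ** 2 - x ** 2, 0.0))
    return (x, y)

def closing_error(omega):
    """omega = (AB, AC, BC, AD, BD, CD). Returns the smallest mismatch between
    the embedded C-D distance and the prescribed CD, over the two mirror
    placements of D relative to the line AB."""
    ab, ac, bc, ad, bd, cd = omega
    c = place_third(ab, ac, bc)
    dx, dy = place_third(ab, ad, bd)
    return min(abs(dist(c, (dx, s * dy)) - cd) for s in (1, -1))

# Distances quoted for Fig. 5 (five decimals, so agreement is only approximate).
d1, d2, d3, d4, d5, d6 = 1.0, 1.58114, 0.70710, 0.87228, 1.32698, 1.54551
omega1 = (d1, d2, d3, d4, d5, d6)
omega2 = (d1, d2, d3, d6, d4, d5)
swapped = (d1, d2, d3, d6, d5, d4)   # a permutation that is not expected to close

err1, err2, err_swapped = (closing_error(o) for o in (omega1, omega2, swapped))
```

With the truncated decimals, the first two permutations close to within the precision of the quoted digits, whereas the third does not, which is consistent with a two-fold degeneracy for this distance set.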




FIG. 6: (color online). An example of three-fold degenerate four-point configurations in $\mathbb{R}^2$. 
The distances are given by $d_1$ = 1, $d_2$ = 1.581144..., $d_3$ = 0.70710..., $d_4$ = 1.34371..., 
$d_5$ = 0.37267..., $d_6$ = 0.68718.... 



Similarly, for k = 3 we choose $\Omega_1 = \{d_1, d_2, d_3, d_4, d_5, d_6\}$, $\Omega_2 = \{d_1, d_2, d_3, d_5, d_6, d_4\}$ 
and $\Omega_3 = \{d_1, d_2, d_3, d_4, d_6, d_5\}$ to hold simultaneously, which reduces the free dimensions 
to three. By choosing $d_1$ = 1, $d_2$ = 1.581144..., $d_3$ = 0.70710..., the equations $D(\Omega_i) = 0$ 
(i = 1, 2, 3) can be solved to yield $d_4$ = 1.34371..., $d_5$ = 0.37267..., $d_6$ = 0.68718.... The 
three-fold degenerate configurations are shown in Fig. 6. 
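
The three-fold example can be verified exactly with a planarity test. The sketch below uses the Cayley-Menger determinant of four points, which vanishes if and only if the points are coplanar; this is an equivalent test and not necessarily the form of the multinomial D used in the text. Two assumptions are ours: the argument order (AB, AC, BC, AD, BD, CD), and the rational squared distances 1, 5/2, 1/2, 65/36, 5/36, 17/36, which agree with the decimals quoted for Fig. 6 to the digits given. The names `det` and `planarity_residual` are hypothetical.

```python
from fractions import Fraction as F

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 5 x 5)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(minor)
    return total

def planarity_residual(sq):
    """Cayley-Menger determinant of four points from the squared distances
    sq = (AB, AC, BC, AD, BD, CD); zero exactly when the points are coplanar."""
    ab, ac, bc, ad, bd, cd = sq
    return det([
        [0, 1, 1, 1, 1],
        [1, 0, ab, ac, ad],
        [1, ab, 0, bc, bd],
        [1, ac, bc, 0, cd],
        [1, ad, bd, cd, 0],
    ])

# Squared distances inferred from the decimals of Fig. 6 (an assumption).
sq = {1: F(1), 2: F(5, 2), 3: F(1, 2), 4: F(65, 36), 5: F(5, 36), 6: F(17, 36)}
perms = {
    "Omega1": (1, 2, 3, 4, 5, 6),
    "Omega2": (1, 2, 3, 5, 6, 4),
    "Omega3": (1, 2, 3, 4, 6, 5),
}
residuals = {name: planarity_residual([sq[i] for i in p]) for name, p in perms.items()}
# A permutation outside the three above, for comparison.
other = planarity_residual([sq[i] for i in (1, 2, 3, 6, 5, 4)])
```

Under these assumptions the three residuals vanish identically in rational arithmetic, while the comparison permutation gives a nonzero value.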
C. d-Dimensional n-Point Configurations with Two-Fold Degeneracy 

In general, the feasible region of k-fold degenerate n-point configurations in $\mathbb{R}^d$ can be 
obtained by carrying out a calculation similar to that used in the previous subsection, which would 
be extremely tedious. However, when the point configurations possess certain symmetries, 
particular degeneracies can readily be constructed. Here we provide constructions of two-fold 
degenerate n-point configurations in $\mathbb{R}^d$ by taking advantage of their symmetries. 

Consider a centrally symmetric $n_1$-point configuration $\Gamma^{(1)}_{d,n_1}$ in $\mathbb{R}^d$, i.e., there exists a 
center $O_1$ such that for every point $P_i^{(1)}$ in $\Gamma^{(1)}_{d,n_1}$ there exists a point $P_j^{(1)}$ for which the 
line segment $P_i^{(1)}P_j^{(1)}$ passing through $O_1$ is bisected by $O_1$ (note that i = j is allowed), i.e., $P_i^{(1)}$ 
and $P_j^{(1)}$ are points of inversion symmetry about $O_1$. Consider another centrally symmetric 
point configuration $\Gamma^{(2)}_{d,n_2}$, in which all the $n_2$ points are distributed symmetrically on a 
one-dimensional line $l^{(2)}$ embedded in $\mathbb{R}^d$. Denote the symmetry center of $\Gamma^{(2)}_{d,n_2}$ by $O_2$. We require 
that the line segment $O_1O_2$ is perpendicular to $l^{(2)}$. Finally, consider the centrally symmetric 
point configuration $\Gamma^{(3)}_{d,2n_3}$, the $2n_3$ points of which are also distributed symmetrically on a 
one-dimensional line $l^{(3)}$ that is parallel to $l^{(2)}$, with the symmetry center coinciding with $O_1$. 
$\Gamma^{(3)}_{d,2n_3}$ can be further partitioned into two subsets: $\Lambda^{(3)}_{d,n_3}$, which contains points in $\Gamma^{(3)}_{d,2n_3}$ 
such that no two points in $\Lambda^{(3)}_{d,n_3}$ are symmetric about $O_1$ (i.e., they are "primary" points); 
and $\Lambda^{(3)*}_{d,n_3}$, which contains the remaining points of $\Gamma^{(3)}_{d,2n_3}$ (i.e., the "dual" points). It is 
clear that 

$$\Gamma_{d,(n_1+n_2+n_3)} = \Gamma^{(1)}_{d,n_1} \cup \Gamma^{(2)}_{d,n_2} \cup \Lambda^{(3)}_{d,n_3}, \qquad \Gamma^{*}_{d,(n_1+n_2+n_3)} = \Gamma^{(1)}_{d,n_1} \cup \Gamma^{(2)}_{d,n_2} \cup \Lambda^{(3)*}_{d,n_3} \tag{15}$$

form a degenerate pair, i.e., the distances from the $n_3$ primary points to the remaining 
($n_1 + n_2$) points in $\Gamma_{d,(n_1+n_2+n_3)}$ are identical to those from the $n_3$ dual points to the remaining 
($n_1 + n_2$) points in $\Gamma^{*}_{d,(n_1+n_2+n_3)}$, while the two resulting configurations are not connected 
by translation, rotation, mirror reflection or any of their combinations. Specific two-fold 
degeneracy examples in $\mathbb{R}^2$ and $\mathbb{R}^3$ are shown in Figs. 7 and 8, respectively. 
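
A small planar instance of this construction can be checked directly. The sketch below is an illustration with coordinates chosen by us (they are not the configurations of Figs. 7 and 8): a centrally symmetric set about $O_1$ = (0, 0), a symmetric pair on the line y = 30 with center $O_2$ = (0, 30), and primary and dual points on the x axis. It compares the sorted lists of squared pair distances, and uses the sorted list of triangles (triples of squared side lengths) as a sufficient test that the two configurations are not congruent. The helper names are hypothetical.

```python
from itertools import combinations

def sq_dist(p, q):
    """Squared Euclidean distance (exact for integer coordinates)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def pair_spectrum(points):
    """Sorted list of all squared pair distances."""
    return sorted(sq_dist(p, q) for p, q in combinations(points, 2))

def triangle_spectrum(points):
    """Sorted list of triangles, each given by its sorted squared side lengths.
    Congruent configurations necessarily share this list."""
    return sorted(
        tuple(sorted((sq_dist(p, q), sq_dist(p, r), sq_dist(q, r))))
        for p, q, r in combinations(points, 3)
    )

gamma1 = [(10, 20), (-10, -20)]   # centrally symmetric about O1 = (0, 0)
gamma2 = [(-15, 30), (15, 30)]    # symmetric on the line y = 30 about O2 = (0, 30)
primary = [(7, 0), (23, 0)]       # on the x axis, no two symmetric about O1
dual = [(-x, -y) for x, y in primary]   # inversion images through O1

config_a = gamma1 + gamma2 + primary
config_b = gamma1 + gamma2 + dual

same_pairs = pair_spectrum(config_a) == pair_spectrum(config_b)
same_triangles = triangle_spectrum(config_a) == triangle_spectrum(config_b)
```

For these coordinates the pair spectra coincide while the triangle spectra differ, so the two six-point configurations share all pair distances without being related by any rigid motion or reflection.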

FIG. 7: (color online). An example of two-fold degenerate configurations in $\mathbb{R}^2$ constructed as 
described in the text. (a) The points in $\Gamma^{(1)}$ are shown in blue, the points in $\Gamma^{(2)}$ are shown in 
red and the points in $\Gamma^{(3)}$ are shown as void circles. (b) and (c) show the degenerate configuration 
pair. 

FIG. 8: (color online). An example of two-fold degenerate configurations in $\mathbb{R}^3$ constructed as 
described in the text. (a) The points in $\Gamma^{(1)}$ are shown in blue, the points in $\Gamma^{(2)}$ are shown in 
red and the points in $\Gamma^{(3)}$ are shown as void circles. (b) and (c) show the degenerate configuration 
pair. 

IV. DISCUSSION 

The D space concept can be applied to address a variety of problems in statistical 
physics, such as the reconstruction of atomic structures from experimentally obtained g2 
and the decorrelation principle, which we will discuss in the ensuing subsections. 

A. Reconstruction of Atomic Structures from Experimentally Obtained g2 



As mentioned in Sec. I, knowledge of the atomic structure of condensed matter can be 
obtained via X-ray scattering experiments. In particular, the two-body distribution function 
g2(r) is related to the Fourier transform of the structure factor S(k), which is proportional to 
the scattering intensity (with the atomic structure function removed). For ideal crystalline 
structures (without any thermal agitation of the atomic centers), g2 consists of a series 
of Dirac delta functions at specific distances. For disordered structures (which lack long-range 
order), g2 is generally a continuous damped oscillating function that decays to its long-range 
value very quickly. Interestingly, although the pair information contained in g2 
of crystalline matter determines the structure to high accuracy, this is not the case 
for disordered structures. 

The reason can be easily seen if we consider the D space. For an ordered point 
configuration (i.e., a lattice), there are strong dependencies among the distances besides those 
required by the feasibility conditions. For example, consider a d-dimensional Bravais lattice 
whose basis vectors are $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_d$. The vector connecting any two points in the lattice 
can then be expressed as 

$$\mathbf{d} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + \cdots + n_d\mathbf{a}_d, \tag{16}$$

where $n_i$ (i = 1, ..., d) are integers. Thus, the distance d between any two lattice points is 
given by 

$$d = \sqrt{\sum_{i=1}^{d}\sum_{j=1}^{d} n_i n_j (\mathbf{a}_i, \mathbf{a}_j)}, \tag{17}$$

where ( , ) denotes the inner product of two vectors. Note that $(\mathbf{a}_i, \mathbf{a}_j) = \frac{1}{2}[(\mathbf{a}_i, \mathbf{a}_i) + (\mathbf{a}_j, \mathbf{a}_j) - (\mathbf{a}_i - \mathbf{a}_j, \mathbf{a}_i - \mathbf{a}_j)]$. 
Thus, Eq. (17) implies that every distance of an ordered point configuration 
can be obtained if the lengths of the basis lattice vectors and the distances between the 
end points of different basis lattice vectors are specified. In other words, Eq. (17) further 
reduces the free dimensions of the D space of the ordered point configuration in $\mathbb{R}^d$ to 
f = d(d + 1)/2. The additional conditions given by Eq. (17) significantly reduce the number 
of feasible permutations of the distances. A unique feature of the distances for lattices 
is that the basis lattice vectors are associated with the smallest distances. To completely 
reconstruct the lattice configuration, the smallest d(d + 1)/2 distances are selected to be 
assembled into a simplex in $\mathbb{R}^d$ defined by the common origin and the end points of all 
the lattice vectors, which in turn determines the fundamental cell of the lattice. In $\mathbb{R}^2$, 
three feasible distances uniquely determine a triangle and thus the rhombical fundamental 
cell. In $\mathbb{R}^3$, there are maximally 30 ways that the 6 distances could be assembled into a 
tetrahedron. However, even for a two-fold degeneracy, the number of equality constraints 
introduced by Eq. (17) (i.e., for two permutations of the 6 distances, Eq. (17) should hold 
simultaneously) is much larger than the free dimensions of the system, which generally rules 
out all non-trivial solutions. Indeed, it is known that for d ≤ 3, pair distances are sufficient 
to uniquely determine Bravais lattices. However, in high dimensions, degeneracies of Bravais 
lattices can be constructed [9]. 
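
The statement that every lattice distance follows from d(d + 1)/2 numbers can be made concrete with a short sketch. It recovers the inner products of the basis vectors from their lengths and from the distances between their end points, using the identity quoted after Eq. (17), and then evaluates Eq. (17) for a given integer vector. The function names `gram_from_distances` and `lattice_distance` are hypothetical and introduced only for this illustration.

```python
from math import sqrt

def gram_from_distances(lengths, end_dists):
    """Inner products (a_i, a_j) from the lengths |a_i| and the end-point
    distances |a_i - a_j|, given as end_dists[(i, j)] with i < j."""
    d = len(lengths)
    gram = [[0.0] * d for _ in range(d)]
    for i in range(d):
        gram[i][i] = lengths[i] ** 2
    for (i, j), dij in end_dists.items():
        # (a_i, a_j) = [(a_i, a_i) + (a_j, a_j) - |a_i - a_j|^2] / 2
        gram[i][j] = gram[j][i] = 0.5 * (lengths[i] ** 2 + lengths[j] ** 2 - dij ** 2)
    return gram

def lattice_distance(n, gram):
    """Distance between two lattice points separated by sum_i n[i] a_i, Eq. (17)."""
    d = len(n)
    q = sum(n[i] * n[j] * gram[i][j] for i in range(d) for j in range(d))
    return sqrt(q)

# Square lattice: |a1| = |a2| = 1, |a1 - a2| = sqrt(2).
square = gram_from_distances([1.0, 1.0], {(0, 1): sqrt(2.0)})
# Triangular lattice: |a1| = |a2| = |a1 - a2| = 1.
triangular = gram_from_distances([1.0, 1.0], {(0, 1): 1.0})

r_square = lattice_distance((3, 4), square)
r_tri_long = lattice_distance((1, 1), triangular)
r_tri_short = lattice_distance((1, -1), triangular)
```

In two dimensions, three numbers (two lengths and one end-point distance) therefore fix the entire distance spectrum of the lattice.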

For a disordered structure, Eq. (17) does not hold and the values of the distances would form 
a continuous spectrum in the infinite-volume limit. We consider the idealized case in which 
there are a finite number of well-defined distances and try to reconstruct the configuration 
from them. An important point is that no matter how carefully experiments might be 
carried out, there would still be small but finite errors associated with the distances, i.e., 
$\tilde{d}_i = d_i + \epsilon_i$, where $\tilde{d}_i$ is the measured value, $d_i$ denotes the real value of the distance and 
$\epsilon_i$ denotes the error. Thus, 
instead of a single point in the D space, the distances correspond to a small uncertainty 
region with the same dimensions as the D space. As we have pointed out, the presence of 
degeneracies corresponds to a feasible region with reduced free dimension, which thus has 
vanishing "volume" compared with the feasible region free of degeneracies; this leads us 
to the conclusion that degeneracies are rare in general. However, due to the uncertainties of 
the measured distances, the feasible regions associated with degeneracies are now "finite" in size compared 
with those free of degeneracies. This explains why in the reconstructions it is hard to 
exactly recover the target configurations, i.e., all configurations associated with the distances 
corresponding to the points in the feasible region should be considered with equal probability 
for a "fair" reconstruction procedure. 



B. Decorrelation Principle 



Recently, Torquato and Stillinger [36] proposed a decorrelation principle concerning 
disordered hard-sphere packings in high-dimensional Euclidean space $\mathbb{R}^d$. In particular, the 
decorrelation principle states that unconstrained spatial correlations vanish for disordered 
packings as the spatial dimension becomes large. In other words, as d increases, the short- 
ranged order beyond contact that exists in low dimensions must diminish. This principle 
has been explicitly observed in a variety of disordered packings in high dimensions 3^. 

The centroids of the hard spheres completely determine a packing, which can be 
considered as a point configuration in $\mathbb{R}^d$ in which there is a minimum value of pair separation 
distances D (i.e., the diameter of the spheres) due to the nonoverlapping condition. The 
decorrelation principle amounts to the following statement concerning the D space of the 
configuration: the requirement that the distances cannot be smaller than D does not affect 
the occurrence frequency of distances with values greater than D in very high dimensions. 
Note that the above should be true for any disordered packings, including both dilute and jammed 
packings. It is known that in low dimensions, g2 = H(r - D) can only be maintained for 
packings with densities less than a critical value [26] and for disordered jammed packings 
g2 shows strong short-ranged oscillations Q], which is the manifestation of local spatial 
correlations due to the nonoverlapping constraint. In other words, for jammed disordered 
packings, the requirement that a desired number of distances of value D must be 
realized in the configuration strongly constrains the possible values of other distances in low 
dimensions, especially those of the same order of magnitude as D. In high dimensions, the above 
requirement becomes less significant in determining the local arrangements of points. 
Consider the construction used in deriving the feasibility conditions: to completely determine 
the position of a point in $\mathbb{R}^d$, a "reference" structure containing at least d points is used. 
The positions of the points in the reference structure can be chosen almost freely subject 
to the mild constraint that no two points can be closer than D. As d increases, larger local 
structures (containing more points) can be constructed before the constraints on the 
separation distances between the points begin to play an important role. In addition, there are 
(d - m) ways to arrange a point that has fixed distances to m (m < d) points in $\mathbb{R}^d$. Thus, 
as d → ∞, the constraints on the pair-distance values imposed by the requirement that a 
desired number of distances with value D must be realized become insignificant, which is 
consistent with the decorrelation principle. 

C. Additional Structural Information 



As we have shown, pair-distance statistics in general is not sufficient information to com- 
pletely determine the point configuration. A natural question is what additional information 
could be used to further reduce the compatible configurations associated with identical radial 
distribution functions. A conventional choice is the three-body correlation function jsl, ^ , 
which provides information on how the pair distances should be linked into triangles. Though 
g3 could in certain circumstances provide additional information on the point configuration, 
its determination requires additional effort, either theoretically or computationally. 

It has been suggested in Ref. [41] that instead of incorporating information contained 
in higher-order versions of g2, namely, g3, g4, etc., one might be better served to seek other 
descriptors at the two-point level, which can be manageably measured and yet reflect 
nontrivial higher-order structural information. One such quantity is the pair-connectedness 
function P2 [2] (i.e., the connectedness contribution to g2), which contains non-trivial 
topological connectedness information of the point configuration. Note that the "connectedness" in 
a point configuration can be defined in many ways; e.g., one could circumscribe spheres 
around each of the points and then define two points to be connected if the two 
associated spheres are either contacting or overlapping. Connectedness information 
contained in P2 is distinct from the "triangular" information embodied in g3, e.g., P2 is 
sensitive to clustering effects, whereas g3 is not. 



D. Generalization to Two-Phase Media 



As pointed out in Sec. I, two-phase media can be constructed by decorating point 
configurations. For example, one can construct sphere packings by assigning to each point a sphere 
centered at the point with diameter equal to the minimal distance in the configuration. In 
this sense, two-phase media are more general than point configurations. The degeneracy 
of discrete point configurations implies the existence of degenerate two-phase media. The 
corresponding pair-distance information for two-phase media is the two-point correlation 
function S2 [2]. The degeneracy of two-phase media and the non-uniqueness issue of their 
reconstruction will be discussed in a sequel (Part II). Here we only provide an example of a 
two-fold degenerate two-phase medium constructed from the "kite-trapezoid" example given 
in Sec. I. 

FIG. 9: (color online). An example of two-fold degenerate continuum media based on the 
"kite-trapezoid" example given in Sec. I. The longest distance in the "kite" and "trapezoid" is 
symmetrically placed on the large circle diameter. 

As shown in Fig. [9], suppose we have two large solid circles in which small circular holes 
are made. One large circle contains the "kite" holes and the other contains the "trapezoid" 
holes. Since initially the two solid large circles are characterized by identical infinite distance 
sets and the same subset of distances is then removed to make the holes, the remaining sets 
of distances for the two large circles with holes are still identical. 

V. CONCLUDING REMARKS 

In this paper, we discussed various aspects of the geometrical ambiguity of pair distance 
statistics associated with general point configurations in $\mathbb{R}^d$. In particular, we introduced the 
idea of the D space and derived the feasibility conditions of the distances which are equivalent 
to the realizability conditions of g2 and the necessary conditions for degeneracy. We applied 
the conditions to construct explicit examples of degenerate point configurations and showed 
that though degeneracies are rare, one could not exclude their existence merely based on 
numerical reconstruction studies. We also applied the D space to problems in statistical 
physics, such as the reconstruction of atomic structures from experimentally obtained g2, 
and the decorrelation principle. 

As pointed out in Sec. IV D, the degeneracy of point configurations implies the existence 
of degenerate random media, and a simple example is provided there. In a sequel to this 
paper js^], we will study the structural degeneracy of general random media and the non- 
uniqueness issue in the reconstruction of heterogeneous materials 